Cache #670
Will the new options |
I guess I should upload the new docs. :-) |
Ok, updated the docs on the web site. |
Hi, I noticed some performance problems with large JSON files (many different keys) and the latest change. I tried using the |
If using many keys then it would be best if |
Yeah, with some large JSON files I have I am seeing a 10x running time versus version |
Do you have a file I can test against and what mode you are using? |
Sure, here is a link to one of the files (19MB): https://drive.google.com/file/d/10sZ-8ksIfgDsq-kfPJUECa9WffzXx8u5/view?usp=sharing In my environment this takes ~2.4 seconds with |
Thank you. What mode are you using? I'll dig into it tomorrow and get it resolved one way or another. |
Thank you! I was using the default ( |
Pushed a "cache-fix" branch. The primary fix was to have … The secondary change was to increase the cache size. Being a cache, it will only show any improvement when keys repeat, either within the same file or across multiple files. Even then, the number of distinct keys should be less than something like 100,000, and maybe lower, like 20,000. It really depends on the situation, so benchmarking with the option on and off is worthwhile for any given workload. Anyway, please let me know what you think about the branch. |
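The behaviour described above can be modelled as a bounded intern table: keys are cached and reused while the table is below its limit, and simply passed through once the limit is reached. This is a minimal, self-contained sketch for illustration only; the class name and limit are made up and this is not the project's actual implementation:

```ruby
# Illustrative model of a bounded key cache (not the real implementation).
class KeyCache
  def initialize(limit)
    @limit = limit  # above this many distinct keys, stop caching
    @table = {}
  end

  # Return one shared String per distinct key so repeated keys are
  # allocated once and then reused on every hit.
  def intern(raw)
    cached = @table[raw]
    return cached if cached
    return raw if @table.size >= @limit  # cache full: pass through
    @table[raw] = raw.dup.freeze
  end
end

cache = KeyCache.new(3)
a = cache.intern("id")
b = cache.intern("id")
# a and b are the very same object, so the hit costs no new allocation.
```

Once the number of distinct keys exceeds the limit, every further `intern` is a lookup miss plus a passthrough, which is why benchmarking with the option on and off matters.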
The branch seems to be performing much better again (back to around 3 seconds). I tried a few different modes again and they are all much faster. I'm not really sure why that would be the case, but maybe I'm doing something wrong. Anyway, this resolves the issue I was seeing. Thank you very much for taking a look at it! |
For that JSON, and when only parsing it once, I would not turn caching on. Each key is unique and there are around 700,000 of them, far outside the recommended limits. Thanks for bringing the issue to my attention. |
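The point above, that a key cache only pays off when keys repeat, can be seen by counting cache hits for the two kinds of workload. This self-contained sketch (the helper name and workloads are made up for illustration) compares a file with a few repeated keys against one where every key is unique:

```ruby
# Fraction of key lookups that would hit a key cache.
def hit_rate(keys)
  seen = {}
  hits = 0
  keys.each do |k|
    if seen[k]
      hits += 1   # key already cached: a hit
    else
      seen[k] = true  # first sighting: a miss that fills the cache
    end
  end
  hits.to_f / keys.size
end

repeated = %w[id name size] * 1000          # same 3 keys over and over
unique   = (1..3000).map { |i| "key#{i}" }  # every key different

hit_rate(repeated)  # => 0.999, nearly every lookup is a hit
hit_rate(unique)    # => 0.0, the cache never pays off
```

With ~700,000 unique keys the second case applies: every lookup is a miss, so the cache adds overhead without ever saving an allocation.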
Adds hash-key caching as well as string caching. Both are optional. Gives a 20% performance boost.